SPECIFICATION

Process for Obtaining DNA, RNA, Peptides,
Polypeptides, or Proteins by Recombinant DNA
Technique

The present invention has as its object a process
for obtaining DNA, RNA, peptides, polypeptides or
proteins through the use of transformed host cells
containing genes capable of expressing these RNAs,
peptides, polypeptides, or proteins; that is to say, by
use of recombinant DNA technique.

The invention aims in particular at the production
of stochastic genes or fragments of stochastic genes
in such a fashion as to permit obtaining simultaneously,
after transcription and translation of these genes, a
very large number (on the order of at least 10,000) of
completely new proteins, in the presence of host
cells (bacterial or eukaryotic) containing these genes
and respectively capable of expressing these proteins,
and at carrying out thereafter a selection or screening
among the said clones, in order to determine which
of them produce proteins with desired properties,
for example structural, enzymatic, catalytic,
antigenic, pharmacologic, or ligand-binding
properties, and more generally chemical,
biochemical, biological, etc. properties.

The invention also has as its aim procedures to
obtain sequences of DNA or RNA with utilizable
properties, notably chemical, biochemical, or
biological properties.

It is clear, therefore, that the invention is open to a
very large number of applications in very many
areas of science, industry and medicine.

The process for production of peptides or
polypeptides according to the invention is
characterized in that one produces simultaneously,
in the same medium, genes which are at least
partially composed of synthetic stochastic
polynucleotides, that one introduces the genes thus
obtained into host cells, that one cultivates
simultaneously the independent clones of the
transformed host cells containing these genes in
such a manner as to clone the stochastic genes
and to obtain the production of the proteins
expressed by each of these stochastic genes, that
one carries out selection and/or screening of the
clones of transformed host cells in a manner to
identify those clones producing peptides or
polypeptides having at least one desired activity,
that one thereafter isolates the clones thus identified
and that one cultivates them to produce at least one
peptide or polypeptide having the said property.

In a first mode of carrying out this process,
stochastic genes are produced by stochastic
copolymerization of the four kinds of
deoxyphosphonucleotides, A, C, G and T, from the
two ends of an initially linearized expression vector,
followed by formation of cohesive ends in such a
fashion as to form a stochastic first strand of DNA
constituted by a molecule of expression vector
possessing two stochastic sequences whose 3' ends
are complementary, followed by the synthesis of the
second strand of the stochastic DNA.

In a second mode of carrying out this process,
stochastic genes are produced by stochastic
copolymerization of oligonucleotides without
cohesive ends, in a manner to form fragments of
stochastic DNA, followed by ligation of these
fragments to a previously linearized expression
vector.
The expression vector can be a plasmid, notably a
bacterial plasmid. Excellent results have been
obtained using the plasmid pUC8 as the expression
vector.

The expression vector can also be viral DNA or a
hybrid of plasmid and viral DNA.

The host cells can be prokaryotic cells such as
HB101 and C600, or eukaryotic cells.

When utilizing the procedure according to the
second mode mentioned above, it is possible to
utilize oligonucleotides which form a group of
palindromic octamers.

Particularly good results are obtained by utilizing
the following group of palindromic octamers:

5' GGAATTCC 3'
5' GGTCGACC 3'
5' CAAGCTTG 3'
5' CCATATGG 3'
5' CATCGATG 3'

It is also possible to use oligonucleotides which
form a group of palindromic heptamers.

Very good results are obtained utilizing the
following group of palindromic heptamers:

5' XTCGCGA 3'
5' XCTGCAG 3'
5' RGGTACC 3'

where X = A, G, C or T, and R = A or T.

According to a particularly advantageous way of
using these procedures, one isolates and purifies
the transforming DNA of the plasmids from a
culture of independent clones of the transformed
host cells obtained by following the procedures
above; the purified DNA is then cut by at least one
restriction enzyme corresponding to a specific
enzymatic cutting site present in the palindromic
octamers or heptamers but absent from the
expression vector which was utilized. This cutting is
followed by inactivation of the restriction enzyme;
one then simultaneously treats the ensemble of
linearized stochastic DNA fragments thus obtained
with T4 DNA ligase, in such a manner as to create a
new ensemble of DNA containing new stochastic
sequences. This new ensemble can therefore contain
a number of stochastic genes larger than the
number of genes in the initial ensemble. One then
utilizes this new ensemble of transforming DNA to
transform the host cells and clone these genes,
carries out screening and/or selection, isolates the
new clones of transformed host cells, and finally
cultivates them to produce at least one peptide or
polypeptide, for example a new protein.

The property serving as the criterion for selection 
of the clones of host cells can be the capacity of the 
peptides or polypeptides, produced by a given 
clone, to catalyse a given chemical reaction. 

With the aim of the production of several
peptides and/or polypeptides, the said property can
be the capacity to catalyse a sequence of reactions
leading from an initial group of chemical
compounds to at least one target compound.

With the aim of producing an ensemble
constituted by a plurality of peptides and
polypeptides which are reflexively autocatalytic, the
said property can be the capacity to catalyse the
synthesis of the same ensemble from amino acids
and/or oligopeptides in an appropriate milieu.

The said property can also be the capacity to
modify selectively the biological or chemical
properties of a given compound, for example the
capacity to selectively modify the catalytic activity of
a polypeptide.

The said property can also be the capacity to
stimulate, inhibit, or modify at least one biological
function of at least one biologically active
compound, chosen, for example, among the
hormones, neurotransmitters, adhesion factors,
growth factors and specific regulators of DNA
replication and/or transcription and/or translation of
RNA.

The said property can equally be the capacity of
the peptide or polypeptide to bind to a given ligand.

The invention also has as its object the use of the
peptide or polypeptide obtained by the process
specified above, for the detection and/or the titration
of a ligand.

According to a particularly advantageous mode of
carrying out the invention, the criterion for selection
of the clones of transformed host cells is the
capacity of these peptides or polypeptides to
simulate or modify the effects of a biologically
active molecule, for example a protein, and
screening and/or selection for clones of transformed
host cells producing at least one peptide or
polypeptide having this property is carried out by
preparing antibodies against the active molecule,
then utilizing these antibodies, after their
purification, to identify the clones containing this
peptide or polypeptide, then by cultivating the
clones thus identified, separating and purifying the
peptide or polypeptide produced by these clones,
and finally by submitting the peptide or polypeptide
to an in vitro assay to verify that it has the capacity
to simulate or modify the effects of the said
molecule.

According to another mode of carrying out the
process according to the invention, the property
serving as the criterion of selection is that of having
at least one epitope similar to one of the epitopes of
a given antigen.

The invention carries over to obtaining
polypeptides by the process specified above and
utilizable as chemotherapeutically active
substances.

In particular, in the case where the said antigen is
EGF, the invention permits obtaining polypeptides
usable for chemotherapeutic treatment of
epitheliomas.

According to a variant of the procedure, one
identifies and isolates the clones of transformed
host cells producing peptides or polypeptides
having the desired property, by affinity
chromatography against antibodies corresponding
to a protein expressed by the natural part of the
DNA hybrid.

For example, in the case where the natural part of
the hybrid DNA contains a gene expressing
β-galactosidase, one can advantageously identify and
isolate the said clones of transformed host cells by
affinity chromatography against anti-β-galactosidase
antibodies.

After expression and purification of hybrid
peptides or polypeptides, one can separate and
isolate their novel parts.

The invention also applies to a use of the process
specified above for the preparation of a vaccine; the
application is characterized by the fact that
antibodies against the pathogenic agent are
isolated, for example antibodies formed after
injection of the pathogenic agent in the body of an
animal capable of forming antibodies against this
agent, and these antibodies are used to identify the
clones producing at least one protein having at least
one epitope similar to one of the epitopes of the
pathogenic agent; the transformed host cells
corresponding to these clones are cultured to
produce these proteins, the protein is isolated and
purified from the clones of cells, and this protein is
then used for the production of a vaccine against the
pathogenic agent.
For example, in order to prepare an anti-HVB
vaccine, one can extract and purify at least one
capsid protein of the HVB virus, inject this protein
into an animal capable of forming antibodies
against this protein, recover and purify the
antibodies thus formed, use these antibodies to
identify the clones producing at least one protein
having at least one epitope similar to one of the
epitopes of the HVB virus, then cultivate the clones
of transformed host cells corresponding to these
clones in a manner to produce this protein, isolate
and purify the protein from the culture of these
clones of cells, and utilize the protein for the
production of an anti-HVB vaccine.

According to an advantageous mode of carrying
out the process according to the invention, the host
cells consist of bacteria such as Escherichia coli
whose genome contains neither the natural gene
expressing β-galactosidase, nor the EBG gene, that
is to say, Z-, EBG- E. coli. The transformed cells are
cultured in the presence of X-gal and the inducer
IPTG in the medium, and cells positive for
β-galactosidase function are detected; thereafter, the
transforming DNA is transplanted into an
appropriate clone of host cells for large scale culture
to produce at least one peptide or polypeptide.
The property serving as criterion for selection of
the clones of transformed host cells can also be the
capacity of the polypeptides or proteins produced
by the culture of these clones to bind to a given
compound.

This compound can in particular be chosen
advantageously among peptides, polypeptides and
proteins, notably among proteins regulating the
transcription activity of DNA.

On the other hand, the compound can also be
chosen among DNA and RNA sequences.

The invention also has as its object those proteins
which are obtained in the case where the property
serving as criterion for selection of the clones of
transformed host cells consists in the capacity of
these proteins to bind to regulatory proteins
controlling the transcription activity of DNA, or else
to DNA and RNA sequences.

The invention has, in addition, as an object the
use of a protein which is obtained in the first
particular case mentioned above, as a cis-regulatory
sequence controlling replication or transcription of a
neighboring DNA sequence.

On the other hand, the aim of the invention also
includes utilization of proteins obtained in the
second case mentioned to modify the properties of
transcription or replication of a sequence of DNA, in
a cell containing the sequence of DNA and
expressing this protein.

The invention has as its object as well a process of
production of DNA, characterized by simultaneous
production, in the same medium, of genes at least
partially composed of stochastic synthetic
polynucleotides, in that the genes thus obtained are
introduced into host cells to produce an ensemble of
transformed host cells, in that screening and/or
selection on this ensemble is carried out to identify
those host cells containing in their genome
stochastic sequences of DNA having at least one
desired property, and finally, in that the DNA from
the clones of host cells thus identified is isolated.

The invention further has as its object a procedure
to produce RNA, characterized by simultaneous
production, in the same medium, of genes at least
partially composed of stochastic synthetic
polynucleotides, in that the genes thus obtained are
introduced into host cells to produce an ensemble of
transformed host cells, in that the host cells so
produced are cultivated simultaneously, and
screening and/or selection of this ensemble is
carried out in a manner to identify those host cells
containing stochastic sequences of RNA having at
least one desired property, and in that the RNA is
isolated from the host cells thus identified.

The said property can be the capacity to bind a
given compound, which might be for example a
peptide or polypeptide or protein, or also the
capacity to catalyse a given chemical reaction, or the
capacity to be a transfer RNA.

Now the process according to the invention will
be described in more detail, as well as some of its
applications, with reference to non-limitative
embodiments.

First we shall describe particularly useful
procedures to carry out the synthesis of stochastic
genes, and the introduction of those genes in
bacteria to produce clones of transformed bacteria.

I) Direct Synthesis on an Expression Vector
a) Linearization of the Vector

30 micrograms, that is, approximately 10^13
molecules, of the pUC8 expression vector are
linearized by incubation for 2 hours at 37°C with 100
units of the PstI restriction enzyme in a volume of
300 µl of the appropriate standard buffer. The
linearized vector is treated with phenol-chloroform,
then precipitated in ethanol, taken up in a volume of
30 µl and loaded onto a 0.6% agarose gel in standard
[...] buffer. After migration in a field of [...] V/cm for
three hours, the linearized vector is electro-eluted,
precipitated in ethanol, and taken up in 30 µl of water.
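
The molecule count quoted for 30 micrograms of vector can be checked with a short calculation. The sketch below is only an estimate: it assumes a pUC8 length of 2665 base pairs and an average mass of about 650 g/mol per base pair, neither of which is stated in this specification, and the function name is illustrative.

```python
AVOGADRO = 6.022e23        # molecules per mole
MASS_PER_BP = 650.0        # g/mol per base pair (assumed average for double-stranded DNA)
PUC8_LENGTH_BP = 2665      # assumed length of pUC8; not given in the text


def molecules_in_mass(mass_micrograms, length_bp, mass_per_bp=MASS_PER_BP):
    """Estimate how many double-stranded DNA molecules a given mass contains."""
    molar_mass = length_bp * mass_per_bp            # g/mol for the whole plasmid
    moles = (mass_micrograms * 1e-6) / molar_mass   # micrograms -> grams -> moles
    return moles * AVOGADRO


n_vector = molecules_in_mass(30, PUC8_LENGTH_BP)
print(f"{n_vector:.2e} molecules in 30 micrograms")
```

Under these assumptions the estimate comes out at roughly 10^13 molecules, so each reaction starts from a very large population of vector ends available for stochastic extension.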

b) Stochastic Synthesis Using the Enzyme Terminal
Transferase (TdT)

30 µg of the linearized vector are reacted for 2
hours at 37°C with 30 units of TdT in 300 µl of the
appropriate buffer, in the presence of 1 mM dGTP, 1
mM dCTP, 0.3 mM dTTP and 1 mM dATP. The lower
concentration of dTTP is chosen in order to reduce
the frequency of "stop" codons in the
corresponding messenger RNA. A similar result,
although somewhat less favorable, can be obtained
by utilizing a lower concentration for dATP than for
the other deoxynucleotide triphosphates. The
progress of the polymerization on the 3' extremity
of the PstI sites is followed by analysis on a gel of
aliquots taken during the course of the reaction.
When the reaction attains or passes a mean value
of 300 nucleotides added per 3' extremity, it is
stopped and the free nucleotides are separated from
the polymer by differential precipitation or by
passage over a column containing a molecular sieve
such as Biogel P60. After concentration by
precipitation in ethanol, the polymers are
subjected to a further polymerization with TdT, first
in the presence of dATP, then in the presence of
dTTP. These last two reactions are separated by a
filtration on a gel and are carried out for short
intervals (30 seconds to 3 minutes) in order to add
sequentially 10-30 A followed by 10-30 T to the 3'
ends of the polymers.
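
The effect of lowering the dTTP concentration can be illustrated numerically. The sketch below is a simplified model, not a description of the enzyme: it assumes that TdT incorporates each base independently and in proportion to its concentration (the real enzyme has base preferences), and that the added strand is read directly as the coding strand. The function names are illustrative.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}   # DNA equivalents of UAA, UAG, UGA


def base_frequencies(concentrations_mM):
    """Turn dNTP concentrations into base frequencies (assumed proportional)."""
    total = sum(concentrations_mM.values())
    return {base: conc / total for base, conc in concentrations_mM.items()}


def stop_codon_probability(freqs):
    """Probability that a triplet of independently drawn bases is a stop codon."""
    return sum(
        freqs[a] * freqs[b] * freqs[c]
        for a, b, c in product("ACGT", repeat=3)
        if a + b + c in STOP_CODONS
    )


equimolar = base_frequencies({"A": 1.0, "C": 1.0, "G": 1.0, "T": 1.0})
reduced_t = base_frequencies({"A": 1.0, "C": 1.0, "G": 1.0, "T": 0.3})

p_equimolar = stop_codon_probability(equimolar)
p_reduced_t = stop_codon_probability(reduced_t)
print(f"equimolar: {p_equimolar:.4f}  reduced dTTP: {p_reduced_t:.4f}")
```

In this model the chance that a given triplet is a stop codon falls from 3/64 (about 4.7%) with equimolar nucleotides to about 2.5% with the concentrations given above, which lengthens the expected open reading frames accordingly.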

c) Synthesis of the Second Strand of Stochastic DNA
Each molecule of vector possesses, at the end of
the preceding operation, two stochastic sequences
whose 3' ends are complementary. The mixture of
polymers is therefore incubated in conditions
favoring hybridization of the complementary
extremities (150 mM NaCl, 10 mM Tris-HCl, pH 7.6, 1
mM EDTA at 65°C for 10 minutes, followed by
lowering the temperature to 22°C at a rate of 3 to
4°C per hour). The hybridized polymers are then
reacted with 60 units of the large fragment (Klenow)
of polymerase I, in the presence of the four
nucleotide triphosphates (200 mM) at 4°C for two
hours. This step accomplishes the synthesis of the
second strand from the 3' ends of the hybrid
polymers. The molecules which result from this
direct synthesis starting from linearized vector are
thereafter utilized to transform competent cells.

Transformation of Competent Clones

100 to 200 ml of competent HB101 or C600 at a
concentration of 10^10 cells/ml are incubated with
the stochastic DNA preparation (from above) in the
presence of 6 mM CaCl2, 6 mM Tris-HCl pH 8, 6 mM
MgCl2 for 30 minutes at 0°C. A temperature shock of
3 minutes at 37°C is imposed on the mixture,
followed by the addition of 400 to 800 ml of NZY
culture medium, without antibiotics. The
transformed culture is incubated at 37°C for 60
minutes, then diluted to 10 litres by addition of NZY
medium containing 40 µg/ml of ampicillin. After
3-5 hours of incubation at 37°C, the amplified
culture is centrifuged, and the pellet of transformed
cells is lyophilized and stored at -70°C. Such a
culture contains 3 x 10^[...] to 10^[...] independent
transformants, each containing a unique stochastic
gene inserted into the expression vector.

II) Synthesis of Stochastic Genes Starting From
Oligonucleotides Without Cohesive Ends
This procedure is based on the fact that
polymerization of judiciously chosen palindromic
oligonucleotides permits construction of stochastic
genes which have no "stop" codon in any of the six
possible reading frames, while at the same time
assuring a balanced representation of triplets
specifying all amino acids. Further, to avoid a
repetition of sequence motifs in the proteins which
result, the oligonucleotides can contain a number of
bases which is not a multiple of three. The example
which follows describes the use of one of the
possible combinations which fulfil these criteria:

a) Choice of a Group of Octamers
The following group of oligonucleotides:

5' GGAATTCC 3'
5' GGTCGACC 3'
5' CAAGCTTG 3'
5' CCATATGG 3'
5' CATCGATG 3'

is composed of 5 palindromes (thus self-
complementary sequences); it is easy to verify
that their stochastic polymerization does not
generate any "stop" codons, and specifies all the
amino acids.
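
The absence of "stop" codons can be verified mechanically. The sketch below is one way to do so, with illustrative function names: because a triplet is shorter than an octamer, every triplet in a polymer lies either inside one octamer or across a single junction, so examining all ordered pairs of octamers on both strands covers all six reading frames of any polymer.

```python
from itertools import product

OCTAMERS = ["GGAATTCC", "GGTCGACC", "CAAGCTTG", "CCATATGG", "CATCGATG"]
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def triplets(seq):
    """All overlapping triplets, i.e. the codons of the three frames of one strand."""
    return {seq[i:i + 3] for i in range(len(seq) - 2)}


def stops_in_polymers(oligos):
    """Stop codons reachable on either strand of any ordered pair of oligomers."""
    found = set()
    for left, right in product(oligos, repeat=2):
        joined = left + right
        for strand in (joined, reverse_complement(joined)):
            found |= triplets(strand) & STOP_CODONS
    return found


all_palindromic = all(o == reverse_complement(o) for o in OCTAMERS)
stops = stops_in_polymers(OCTAMERS)
print("palindromic:", all_palindromic, "| stop codons found:", sorted(stops))
```

The same function can be applied to any other candidate group of oligomers before it is used for polymerization.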

Obviously, one can utilize other groups of
palindromic octamers which do not generate any
"stop" codons and specify all the amino acids found
in polypeptides. Clearly, it is also possible to utilize
non-palindromic groups of octamers, or other
oligomers, under the condition that their
complements forming double stranded DNA are
also used.
b) Assembly of a Stochastic Gene From a Group of
Octamers

A mixture containing 5 µg each of the
oligonucleotides indicated above (previously
phosphorylated at the 5' position by a standard
procedure) is reacted in a 100 µl volume containing
1 mM ATP, 10% polyethylene glycol, and 100 units
of T4 DNA ligase in the appropriate buffer at 13°C for
six hours. This step carries out the stochastic
polymerization of the oligomers in the double
stranded state and without cohesive ends. The
resulting polymers are isolated by passage over a
molecular sieve (Biogel P60), recovering those with
20 to 100 oligomers. After concentration, this
fraction is again submitted to polymerization
catalysed by T4 DNA ligase under the
conditions described above. Thereafter, as
described above, those polymers which have
assembled at least 100 oligomers are isolated.



c) Preparation Of The Host Plasmid 

The pUC8 expression vector is linearized by SmaI
enzyme in the appropriate buffer, as described
above. The vector linearized by SmaI does not have
cohesive ends. To prevent it from recircularizing
during ligation, the linearized vector is treated
by calf intestine alkaline phosphatase (CIP) at a level
of one unit per microgram of vector in the
appropriate buffer, at 37°C for 30 minutes. The CIP
enzyme is thereafter inactivated by two successive
extractions with phenol-chloroform. The linearized
and dephosphorylated vector is precipitated in
ethanol, then redissolved in water at 1 mg/ml.

d) Ligation of Stochastic Genes To The Vector 

Equimolar quantities of vector and polymers are
mixed and incubated in the presence of 1000 units
of T4 DNA ligase, 1 mM ATP, 10% polyethylene
glycol, in the appropriate buffer, for 12 hours at
13°C. This step ligates the stochastic polymers into
the expression vector and forms double stranded
circular molecules which are, therefore, capable of
transforming.

Transformation of Competent Clones
Transformation of competent clones is carried out
in the manner previously described.

III) Assembly of Stochastic Genes Starting From A
Group Of Heptamers
This procedure differs from that just discussed in
that it utilizes palindromic heptamers which have
variable cohesive ends, in place of stochastic
sequences containing a smaller number of identical
motifs.

a) Choice of a Group of Heptamers

It is possible, as an example, to use the following
three palindromic heptamers:

5' XTCGCGA 3'
5' XCTGCAG 3'
5' RGGTACC 3'

where X = A, G, C, or T and R = A or T, and where
polymerization cannot generate any "stop" codons
and forms triplets specifying all the amino acids.

Clearly it is possible to use other groups of
heptamers fulfilling these same conditions.

b) Polymerization of a Group of Heptamers

This polymerization is carried out exactly in the
fashion described above for octamers.

c) Elimination Of Cohesive Extremities
The polymers thus obtained have one unpaired
base on their two 5' extremities. Thus, it is
necessary to add the complementary base to the
corresponding 3' extremities. This is carried out as
follows: 10 micrograms of the double stranded
polymers are reacted with 10 units of the Klenow
enzyme, in the presence of the four
deoxynucleotide triphosphates (200 mM) in a volume
of 100 µl, at 4°C, for 60 minutes. The enzyme is
inactivated by phenol-chloroform extraction, and
the polymers are cleansed of the residual free



5 



GB 2 183 661 A 5 



nucleotides by differential precipitation. The
polymers are then ligated to the host plasmid
(previously linearized and dephosphorylated) by
following the procedures described above.

It is to be noted that the two last procedures which
were described utilize palindromic octamers or
heptamers which constitute specific sites of certain
restriction enzymes. These sites are absent, for the
most part, from the pUC8 expression vector. Thus, it
is possible to augment considerably the complexity
of an initial preparation of stochastic genes by
proceeding in the following way: the plasmid DNA
derived from the culture of 10^[...] independent
transformants obtained by one of the two last
procedures described above is isolated. After this
DNA is purified, it is partially digested by the ClaI
restriction enzyme (procedure II) or by the PstI
restriction enzyme (procedure III). After inactivation
of the enzyme, the partially digested DNA is treated
with T4 DNA ligase, which has the effect of creating a
very large number of new sequences, while
conserving the fundamental properties of the initial
sequences. This new ensemble of stochastic
sequences can then be used to transform competent
cells. In addition, the stochastic genes cloned by
procedures II and III can be excised intact from the
pUC8 expression vector by utilizing restriction sites
belonging to the cloning vector and not represented
in the stochastic DNA sequences.
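
The correspondence between the oligomers and restriction sites can be made explicit with a small lookup. In the sketch below the recognition sequences are standard reference values for the named enzymes and are not taken from this specification; only ClaI and PstI are named in the text, and the other enzyme assignments are added for illustration.

```python
# Standard recognition sequences (reference values, not from the specification).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "HindIII": "AAGCTT",
    "NdeI": "CATATG",
    "ClaI": "ATCGAT",
    "NruI": "TCGCGA",
    "PstI": "CTGCAG",
    "KpnI": "GGTACC",
}

OCTAMERS = ["GGAATTCC", "GGTCGACC", "CAAGCTTG", "CCATATGG", "CATCGATG"]
# Heptamers without their variable 5' base (X or R).
HEPTAMER_CORES = ["TCGCGA", "CTGCAG", "GGTACC"]


def sites_in(seq):
    """Names of the enzymes whose recognition sequence occurs in seq."""
    return sorted(name for name, site in RECOGNITION_SITES.items() if site in seq)


octamer_sites = {oligo: sites_in(oligo) for oligo in OCTAMERS}
heptamer_sites = {core: sites_in(core) for core in HEPTAMER_CORES}

for oligo, enzymes in {**octamer_sites, **heptamer_sites}.items():
    print(oligo, "->", ", ".join(enzymes))
```

Each oligomer carries exactly one of these sites, which is what allows a partial digest with a single enzyme, followed by religation, to reshuffle the polymers at the positions of one oligomer type only.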
Recombination within the stochastic genes
generated by the two procedures just described,
which results from the internal homology due to the
recurrent molecular motifs, is an important
additional method to achieve in vivo mutagenesis of
the coding sequences. This results in an
augmentation of the number of new genes which
can be examined.

Finally, for all the procedures to generate novel
synthetic genes, it is possible to use a number of
common techniques to modify genes in vivo or in
vitro, such as a change of reading frame, inversion
of sequences with respect to their promoter, point
mutations, or utilization of host cells expressing one
or several suppressor tRNAs.
In considering the above description, it is clear
that it is possible to construct in vitro an extremely
large number (for example greater than a billion) of
different genes, by enzymatic polymerization of
nucleotides or of oligonucleotides. This
polymerization is carried out in a stochastic manner,
as determined by the respective concentrations of
the nucleotides or oligonucleotides present in the
reaction mixture.
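
The order of magnitude can be illustrated for the group of five palindromic octamers. The sketch below counts the distinct double-stranded polymers that n octamers can form, treating a duplex and its reverse complement as the same molecule; the closed-form expression and the function names are illustrative and are checked against a brute-force enumeration for short polymers.

```python
from itertools import product

OCTAMERS = ["GGAATTCC", "GGTCGACC", "CAAGCTTG", "CCATATGG", "CATCGATG"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def distinct_duplexes(n_oligos, group_size=5):
    """Closed-form count of distinct duplexes made of n palindromic oligomers."""
    # Polymers whose order of oligomers reads the same in both directions.
    symmetric = group_size ** ((n_oligos + 1) // 2)
    return (group_size ** n_oligos + symmetric) // 2


def distinct_duplexes_brute_force(n_oligos, oligos=OCTAMERS):
    """Enumerate every polymer and merge each with its reverse complement."""
    seen = set()
    for combo in product(oligos, repeat=n_oligos):
        seq = "".join(combo)
        seen.add(min(seq, reverse_complement(seq)))
    return len(seen)


def smallest_length_exceeding(threshold):
    """Smallest number of oligomers giving more than `threshold` distinct duplexes."""
    n = 1
    while distinct_duplexes(n) <= threshold:
        n += 1
    return n


n_billion = smallest_length_exceeding(10**9)
print("octamers needed to exceed a billion sequences:", n_billion)
```

On this count, polymers of only 14 octamers (112 base pairs) already allow more than a billion distinct sequences, so the practical limit on diversity is the number of independent transformants rather than the number of possible genes.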
As indicated above, two methods can be utilized
to clone such genes (or coding sequences): the
polymerization can be carried out directly on a
cloning expression vector which was previously
linearized; or it is possible to proceed sequentially
to the polymerization, then the ligation of the
polymers to the expression vector.

In the two cases, the next step is transformation or
transfection of competent bacterial cells (or cells in
culture). This step constitutes cloning the stochastic
genes in living cells, where they are indefinitely
propagated and expressed.



Clearly, in addition to the procedures which were
described above, it is feasible to use all other
methods which are appropriate for the synthesis of
stochastic sequences. In particular, it is possible to
carry out polymerization, by biochemical means, of
single stranded oligomers of DNA or RNA obtained
by chemical synthesis, then treat these segments of
DNA or RNA by established procedures to generate
double stranded DNA (cDNA) in order to clone such
genes.

Screening Or Selection Of Clones Of Transformed 
Host Cells 

The further step of the procedure according to the
invention consists in examining the transformed or
transfected cells by selection or screening, in order
to isolate one or several cells whose transforming
DNA, or transcription product (RNA), or translation
product (protein) has a desired property. These
properties can be, for example, enzymatic,
functional, or structural.

One of the most important aspects of the process
according to the invention is that it permits the
simultaneous screening or selection of an
exploitable product (RNA or protein) and the gene
which produces that product. In addition, the DNA
synthesized and cloned as described can be
selected or screened in order to isolate sequences of
DNA constituting products in themselves, having
exploitable biochemical properties.

We shall now describe, as non-limiting
examples, preferred procedures for screening or
selection of clones of transformed cells such that the
novel proteins are of interest from the point of view
of industrial or medical applications.

One of these procedures rests on the idea of
producing or obtaining polyclonal or monoclonal
antibodies, by established techniques, directed
against a protein or another type of molecule of
biochemical or medical interest, where that
molecule is, or has been rendered, immunogenic,
and thereafter using these antibodies as probes to
identify, among the very large number of clones
transformed by stochastic genes, those whose
proteins react with these antibodies. This reaction is
a result of a structural homology which exists
between the polypeptide synthesized by the
stochastic gene and the initial molecule. It is
possible in this way to isolate numbers of novel
proteins which behave as epitopes or antigenic
determinants on the initial molecule. Such novel
proteins are liable to simulate, stimulate, modulate,
or block the effect of the initial molecule. It will be
clear that this means of selection or screening may
itself have very many pharmacologic and
biochemical applications. Below we describe, as a
non-limiting example, this first mode of operation in
a concrete case:

EGF (epidermal growth factor) is a small protein
present in the blood, whose role is to stimulate the
growth of epithelial cells. This effect is obtained by
the interaction of EGF with a specific receptor
situated in the membrane of epithelial cells.
Antibodies directed against EGF are prepared by
injecting animals with EGF coupled to KLH (keyhole
limpet hemocyanin) to augment the
immunogenicity of the EGF. The anti-EGF
antibodies of the immunized animals are purified,
for example, by passage over an affinity column,
where the ligand is EGF or a synthetic peptide
corresponding to a fragment of EGF. The purified
anti-EGF antibodies are then used as probes to
screen a large number of bacterial clones lysed by
chloroform on a solid support. The anti-EGF
antibodies bind those stochastic peptides or
proteins whose epitopes resemble those of the
initial antigen. The clones containing such peptides
or proteins are shown by autoradiography after
incubation of the solid support with radioactive
protein A, or after incubation with a radioactive
anti-antibody antibody.

These steps identify those clones, each of which
contains one protein (and its gene) reacting with the
screening antibody. It is feasible to screen among a
very large number of colonies of bacterial cells or
viral plaques (for example, on the order of
1,000,000) and it is feasible to detect extremely
small quantities, on the order of 1 nanogram, of
protein product. Thereafter, the identified clones are
cultured and the proteins so detected are purified in
conventional ways. These proteins are tested in
vitro in cultures of epithelial cells to determine if
they inhibit, simulate, or modulate the effects of EGF
on these cultures. Among the proteins so obtained,
some may be utilized for the chemotherapeutic
treatment of epitheliomas. The activities of the
proteins thus obtained can be improved by
mutation of the DNA coding for the proteins, in
ways analogous to those described above. A variant
of this procedure consists in purifying these
stochastic peptides, polypeptides or proteins, which
can be used as vaccines or more generally to confer
an immunity against a pathogenic agent or to
exercise other effects on the immunological system,
for example, to create a tolerance or diminish
hypersensitivity with respect to a given antigen, in
particular due to binding of these peptides,
polypeptides or proteins with the antibodies
directed against this antigen. It is clear that it is
possible to use such peptides, polypeptides or
proteins in vitro as well as in vivo.

More precisely, in the ensemble of novel proteins
which react with the antibodies against a given
antigen X, each has at least one epitope in common
with X, thus the ensemble has an ensemble of
epitopes in common with X. This permits utilization
of the ensemble or sub-ensemble as a vaccine to
confer immunity against X. It is, for example, easy to
purify one or several of the capsid proteins of the
hepatitis B virus. These proteins can then be
injected into an animal, for example, a rabbit, and
the antibodies corresponding to the initial antigen
can be recovered by affinity column purification.
These antibodies may be used, as described above,
to identify clones producing at least one protein
having an epitope resembling at least one of the
epitopes of the initial antigen. After purification,
these proteins are used as antigens (either alone or
in combination) with the aim of conferring
protection against hepatitis B. The final production
of the vaccine does not require further access to the
initial pathogenic agent.

Note that during the description of the
procedures above, a number of means to achieve
selection or screening have been described. All
these procedures may require the purification of a
particular protein from a transformed clone. These
protein purifications can be carried out by
established procedures and utilize, in particular, the
techniques of gel chromatography, ion exchange
chromatography, and affinity chromatography. In
addition, the proteins generated by the stochastic
genes can have been cloned in the form of hybrid
proteins having, for example, a sequence of the
β-galactosidase enzyme which permits affinity
chromatography against anti-β-galactosidase
antibodies, and allows the subsequent cleavage of
the hybrid part (that is to say, allowing separation of
the novel part and the bacterial part of the hybrid
protein). Below we describe the principles and
procedures for selection of peptides or polypeptides
and the corresponding genes, according to a second
method of screening or selection based on the
detection of the capacity of these peptides or
polypeptides to catalyse a specific reaction.

As a concrete and non-limiting example,
screening or selection in the particular case of
proteins capable of catalysing the cleavage of
lactose, normally a function fulfilled by the enzyme
β-galactosidase (β-gal), will be described.

As described above, the first step of the process
consists in generating a very large ensemble of
expression vectors, each expressing a distinct novel
protein. To be concrete, for example, one may
choose the pUC8 expression vector with cloning of
stochastic sequences of DNA in the PstI restriction
site. The plasmids thus obtained are then
introduced in a clone of E. coli from whose genome
the natural gene for β-galactosidase, Z, and a
second gene EBG, unrelated to the first but able to
mutate towards β-gal function, have both been
eliminated by known genetic methods. Such host
cells (Z⁻, EBG⁻) are not able by themselves to
catalyse lactose hydrolysis, and as a consequence
cannot use lactose as carbon source for growth. This
permits utilization of such host clones for screening
or selection for β-gal function.
A convenient biological assay to analyse
transformed E. coli clones for those which have
novel genes expressing a β-gal function consists in
the culture of bacteria transformed as described in
Petri dishes containing X-gal in the medium. In this
case, all bacterial colonies expressing a β-gal
function are visualized as blue colonies. By using
such a biological assay, it is possible to detect even
weak catalytic activity. The specific activity of
characteristic enzymes ranges from 10 to 10,000
product molecules per second.

Supposing that a protein synthesized by a
stochastic gene has a weak specific activity, on the
order of one molecule per 100 seconds, it remains
possible to detect such catalytic activity. In a Petri
dish containing X-gal in the medium, and in the
presence of the non-metabolizable inducer IPTG
(isopropyl-β-D-thiogalactoside), visualization of a blue
region requires cleavage of about 10¹⁰ to 10¹¹
molecules of X-gal per square millimeter. A
bacterial colony expressing a weak enzyme and
occupying a surface area of 1 mm² has about 10⁷ to
10⁸ cells. If each cell has only one copy of the weak
enzyme, each cell would need to catalyse cleavage
of between 10,000 and 100 molecules of X-gal to be
detected, which would require between 2.7 and 270 hours.
Since under selective conditions it is possible to
amplify the number of copies of each plasmid per
cell to 5 to 20 copies per cell, or even to 100 to 1000,
and because up to 10% of the protein of the cell can
be specified by the new gene, the duration needed
to detect a blue colony in the case of 100 enzyme
molecules of weak activity per cell is on the order of
0.27 to 2.7 hours.
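
The arithmetic behind these detection times can be sketched as follows. This is only a back-of-envelope check, not part of the described procedure: the function names are illustrative, and the inputs are read as 10^10 to 10^11 X-gal molecules and 10^7 to 10^8 cells per square millimeter, the reading consistent with the 100 to 10,000 cleavages per cell stated in the text.

```python
SECONDS_PER_HOUR = 3600.0

def cleavages_per_cell_range(molecules_per_mm2=(1e10, 1e11),
                             cells_per_mm2=(1e7, 1e8)):
    """(min, max) cleavages each cell must perform for a 1 mm^2 colony to turn blue."""
    low = molecules_per_mm2[0] / cells_per_mm2[1]   # fewest molecules, most cells
    high = molecules_per_mm2[1] / cells_per_mm2[0]  # most molecules, fewest cells
    return low, high

def hours_to_detect(cleavages_per_cell, enzymes_per_cell, turnover_per_second):
    """Hours for one cell to cleave the required number of X-gal molecules."""
    rate = enzymes_per_cell * turnover_per_second   # cleavages per second per cell
    return cleavages_per_cell / rate / SECONDS_PER_HOUR

WEAK_TURNOVER = 1 / 100  # one product molecule per 100 seconds, as in the text

low, high = cleavages_per_cell_range()
for copies in (1, 100):
    fastest = hours_to_detect(low, copies, WEAK_TURNOVER)
    slowest = hours_to_detect(high, copies, WEAK_TURNOVER)
    print(f"{copies:>3} enzyme molecule(s) per cell: "
          f"{fastest:.2g} to {slowest:.3g} hours")
```

With one enzyme molecule per cell this gives roughly 2.8 to 280 hours, in line with the range quoted. With 100 molecules per cell the same model gives an upper bound of about 2.8 hours; its lower bound comes out nearer 0.03 hours than the 0.27 hours quoted, so that figure may rest on assumptions not captured in this simple sketch.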

As a consequence of these facts, screening a very
large number of independent bacterial colonies,
each expressing a different novel gene, and using
the capacity to express a β-gal function as the
selection criterion, is fully feasible. It is possible to
carry out screening of about 2000 colonies in one
Petri dish of 10 cm diameter. Thus, about 20 million
colonies can be screened on a sheet of X-gal agar of
1 square meter.

It is to be noted that bacterial colonies which
appear blue on X-gal Petri dishes might be false
positives due to a mutation in the bacterial genome
which confers on it the capacity to metabolize
lactose, or for other reasons than those which result
from a catalytic activity of the novel protein
expressed by the cells of the colony. Such false
positives can be directly eliminated by purifying the
DNA of the expression vector from the positive
colony, and retransforming Z⁻, EBG⁻ E. coli host
cells. If the β-gal activity is due to the novel protein
coded by the new gene in the expression vector, all
those cells transformed by that vector will exhibit
β-gal function. In contrast, if the initial blue colony is
due to a mutation in the genome of the host cell, it is
a rare event and independent of the transformation,
thus the number of cells of the new clone of
transformed E. coli capable of expressing β-gal
function will be small or zero.

The power of mass simultaneous purification of
all the expression vectors from all the positive
clones (blue) followed by retransformation of naive
bacteria should be stressed. Suppose that the aim is
to carry out a screening to select proteins having a
catalytic function, and that the probability that a new
peptide or polypeptide carries out this function at
least weakly is 10⁻⁶, while the probability that a
clone of the E. coli bacterial host undergoes a
mutation rendering it capable of carrying out the
same function is 10⁻⁵. Then it can be calculated that
among 20 million transformed bacteria which are
screened, 20 positive clones will be attributable to
the novel genes in the expression vectors which each
carries, while 200 positive clones will be the result of
genomic mutation. Mass purification of the
expression vectors from the total of 220 positive
bacterial clones followed by retransformation of
naive bacteria with the mixture of these expression
vectors will produce a large number of positive
clones consisting of all those bacteria transformed
with the 20 expression vectors which code for the
novel proteins having the desired function, and a
very small number of bacterial clones resulting from
genomic mutations and containing the 200
expression vectors which are not of interest. A small
number of cycles of purification of expression
vectors from positive bacterial colonies, followed by
such retransformation, allows the detection of very
rare expression vectors truly positive for a desired
catalytic activity, despite a high background rate of
mutations in the host cells for the same function.
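
The enrichment obtained by one cycle of mass purification and retransformation can be illustrated with a small calculation. This is a sketch under simplifying assumptions that are not in the text: every purified vector is taken to give the same number of retransformants (the hypothetical parameter `transformants_per_vector`), each retransformed cell carries one vector, and host mutations arise independently at the quoted rate. The probabilities 1e-6 and 1e-5 are the values implied by the 20 and 200 positive clones among 20 million.

```python
def expected_positives(n_screened, p_vector, p_host_mutation):
    """Expected blue clones due to the novel gene and due to host mutation."""
    return n_screened * p_vector, n_screened * p_host_mutation

def blue_after_retransformation(true_vectors, other_vectors,
                                p_host_mutation, transformants_per_vector=1000):
    """Expected blue colonies after retransforming naive hosts with the vector mix.

    Cells receiving a truly positive vector are all blue; cells receiving an
    irrelevant vector are blue only if the host itself has mutated.
    """
    blue_true = true_vectors * transformants_per_vector
    blue_false = other_vectors * transformants_per_vector * p_host_mutation
    return blue_true, blue_false

true_pos, false_pos = expected_positives(20_000_000, 1e-6, 1e-5)
print(f"first screen: {true_pos:.0f} true, {false_pos:.0f} false positives "
      f"({true_pos / (true_pos + false_pos):.1%} true)")

blue_true, blue_false = blue_after_retransformation(20, 200, 1e-5)
print(f"after one cycle: {blue_true / (blue_true + blue_false):.2%} "
      f"of blue colonies carry a truly positive vector")
```

Under these assumptions the share of blue colonies that carry a truly positive vector rises from about 9% in the first screen to well over 99% after a single cycle, which is the effect the paragraph above describes.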

Following screening operations of this type, it is
possible to purify the new protein by established
techniques. The production of that protein in large
quantity is made possible by the fact that
identification of the useful protein occurs together
with simultaneous identification of the gene coding
for the same protein. Consequently, either the same
expression vector can be used, or the novel gene
can be transplanted into a more appropriate
expression vector for its synthesis and isolation in
large quantity.

It is feasible to apply this method of screening for
any enzymatic function for which an appropriate
biological assay exists. For such screenings, it is not
necessary that the enzymatic function which is
sought be useful to the host cell. It is possible to
carry out screenings not only for an enzymatic
function but for any other desired property for
which it is possible to establish an appropriate
biological assay. It is thus feasible to carry out, even
in the simple case of β-gal function visualized on an
X-gal Petri plate, a screening of on the order of 100
million, or even a billion novel genes for a catalytic
activity or any other desired property.

Selection Of Transformed Host Cells
On the other hand, it is possible to use selection
techniques for any property, catalytic or otherwise,
where the presence or absence of the property can
be rendered essential for the survival of the host
cells containing the expression vectors which code
for the novel genes, or also can be used to select for
those viruses coding and expressing the desired
novel gene. As a non-limiting, but concrete
example, selection for β-galactosidase function
shall be described. An appropriate clone of Z⁻ EBG⁻
E. coli is not able to grow on lactose as the sole
carbon source. Thus, after carrying out the first step
described above, it is possible to culture a very large
number of host cells transformed by the expression
vectors coding for the novel genes, under selective
conditions, either by progressive diminution of
other sources of carbon, or utilization of lactose
alone from the start. During the course of such
selection, in vivo mutagenesis by recombination, or
by explicitly recovering the expression vectors and
mutagenizing their novel genes in vitro by various
mutagens, or by any other common technique,
permits adaptive improvements in the capacity to
fulfil the desired catalytic function. When both
selection techniques and convenient bioassay
techniques exist at the same time, as in the present
case, it is possible to use selection techniques
initially to enrich the representation of host bacteria
expressing the β-gal function, then carry out a
screening on a Petri plate on X-gal medium to
establish efficiently which are the positive cells. In
the absence of convenient bioassays, application of
progressively stricter selection is the easiest route to
purify one or a small number of distinct host cells
whose expression vectors code for the proteins
catalysing the desired reaction.
It is possible to utilize these techniques to find
novel proteins having a large variety of structural
and functional characteristics beyond the capacity
to catalyse a specific reaction. For example, it is
possible to carry out a screen or select for novel
proteins which bind to cis-regulatory sites on the
DNA and thereby block the expression of one of the
host cell's functions, or block transcription of the
DNA, stimulate transcription, etc.
For example, in the case of E. coli, a clone mutant
in the repressor of the lactose operon (i⁻) expresses
β-gal function constitutively due to the fact that the
lactose operator is not repressed. All cells of this
type produce blue clones on Petri plates containing
X-gal medium. It is possible to transform such host
cells with expression vectors synthesizing novel
proteins and carry out a screen on X-gal Petri plates
in order to detect those clones which are not blue.
Among those, some represent the case where the
new protein binds to the lactose operator and
represses the synthesis of β-gal. It is then feasible to
mass isolate such plasmids, retransform, isolate
those clones which do not produce β-gal, and
thereafter carry out a detailed verification.

Finally, the process can be utilized
in order to create, then isolate, not only exploitable
proteins, but also RNA and DNA as products in
themselves, having exploitable properties. This
results from the fact that, on one hand, the
procedure consists in creating stochastic sequences
of DNA which may interact directly with other
cellular or biochemical constituents, and on the
other hand, these sequences cloned in expression
vectors are transcribed into RNAs which are
themselves capable of multiple biochemical
interactions.

An Example Of The Use Of The Procedure To Create
And Select For A DNA Which Is Useful In Itself.
This example illustrates selection for a useful
DNA, and the purification and study of the
mechanism of action of regulatory proteins which
bind to the DNA. Consider a preparation of the
oestradiol receptor, a protein obtained by standard
techniques. In the presence of oestradiol, a steroid
sex hormone, the receptor changes
conformation and binds tightly to certain specific
sequences in the genomic DNA, thus affecting the
transcription of genes implicated in sexual
differentiation and the control of fertility. By
incubating a mixture containing oestradiol, its
receptor and a large number of different stochastic
DNA sequences inserted in their vectors, followed
by filtration of the mixture across a nitrocellulose
membrane, one has a direct selection for
stochastic DNA sequences binding to the
oestrogen-receptor complex, where only those DNAs
bound to a protein are retained by the membrane. After
washing and elution, the DNA liberated from the
membrane is utilized as such to transform bacteria.
After culture of the transformed bacteria, the vectors
which they contain are again purified and one or
several cycles of incubation, filtration and
transformation are carried out as described above.
These procedures allow the isolation of stochastic
sequences of DNA having an elevated affinity for the
oestradiol-receptor complex. Such sequences are
open to numerous diagnostic and pharmacologic
applications, in particular, for developing synthetic
oestrogens for the control of fertility and treatment
of sterility.
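
Why a few cycles of incubation, filtration and transformation suffice can be illustrated with a simple enrichment model. The sketch below is purely illustrative: the initial fraction of binding sequences and the retention probabilities on the membrane are hypothetical values chosen for the example, not figures from this description, and amplification in bacteria is assumed to restore the pool without changing its proportions.

```python
def binder_fraction_after_cycles(initial_fraction, retain_bound,
                                 retain_unbound, cycles):
    """Fraction of receptor-binding DNA in the pool after each filtration cycle.

    retain_bound   : assumed probability that a binding sequence stays on the filter
    retain_unbound : assumed background retention of a non-binding sequence
    """
    bound = initial_fraction
    unbound = 1.0 - initial_fraction
    history = [bound]
    for _ in range(cycles):
        bound *= retain_bound        # binders retained via the receptor complex
        unbound *= retain_unbound    # non-binders retained only as background
        total = bound + unbound
        bound, unbound = bound / total, unbound / total  # regrowth, renormalized
        history.append(bound)
    return history

# Hypothetical numbers: one binder per million, 50% retention, 0.1% background.
history = binder_fraction_after_cycles(1e-6, 0.5, 0.001, cycles=3)
for cycle, fraction in enumerate(history):
    print(f"cycle {cycle}: binder fraction {fraction:.3g}")
```

In this model each cycle multiplies the odds of finding a binder by the ratio of the two retention probabilities, so even a very rare sequence dominates the pool after a handful of cycles.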

Creation And Selection Of An RNA Useful In Itself
Let there be a large number of stochastic DNA
sequences, produced as has been described and
cloned in an expression vector. It follows that the
RNA transcribed from these sequences in the
transformed host cells can be useful products
themselves. As a non-limiting example, it is possible
to select a stochastic gene coding for a suppressor
transfer RNA (tRNA) by the following procedure: a
large number (≥ 10⁸) of stochastic sequences is
transformed into competent bacterial hosts carrying
a "nonsense" mutation in the arg E gene. These
transformed bacteria are plated on minimal medium
without arginine and with the selective antibiotic for
that plasmid (ampicillin if the vector is pUC8). Only
those transformed bacteria which have become
capable of synthesizing arginine will be able to
grow. This phenotype can result either from a back
mutation of the host genome, or the presence in the
cell of a suppressor. It is easy to test each
transformed colony to determine if the Arg+
phenotype is or is not due to the presence of the
stochastic gene in its vector; it suffices to purify the
plasmid from this colony and verify that it confers
an Arg+ phenotype on all arg E cells transformed
by it.

Selection Of Proteins Capable Of Catalysing A
Sequence Of Reactions

Below we describe another means of selection,
open to independent applications, based on the
principle of simultaneous and parallel selection of a
certain number of novel proteins capable of
catalysing a connected sequence of reactions.
The basic idea of this method is the following:
given an initial ensemble of chemical compounds
considered as building blocks or elements of
construction from which it is hoped to synthesize
one or several desired chemical compounds by
means of a catalysed sequence of chemical
reactions, there exists a very large number of
reaction routes which can be partially or completely
substituted for one another, which are all
thermodynamically possible, and which lead from
the set of building blocks to the desired target
compound(s). Efficient synthesis of a target
compound is favored if every step of at least one
reaction pathway leading from the building block
compounds to the target compound is catalysed.
On the other hand, it is relatively less important to
determine which among the many independent or
partially independent reaction pathways is catalysed.
In the previous description, we have shown how it is
possible to obtain a very large number of host cells
each of which expresses a distinct novel protein.

Each of these novel proteins is a candidate to
catalyse one or another of the possible reactions, in
the set of all the possible reactions leading from the
ensemble of building blocks to the target
compound. If a sufficiently large number of
stochastic proteins is present in a reaction mixture
containing the building block compounds, such that
a sufficiently large number of the possible reactions
are catalysed, there is a high probability that one
connected sequence of reactions leading from the
set of building block compounds to the target
compound will be catalysed by a subset of the novel
proteins. It is clear that this procedure can be
extended to the catalysis not only of one, but of
several target compounds simultaneously.

Based on this principle it is possible to proceed as
follows in order to select in parallel a set of novel
proteins catalysing a desired sequence of chemical
reactions:

1. Specify the desired set of compounds
constituting the building blocks, utilizing
preferentially a reasonably large number of distinct
chemical species in order to increase the number of
potential concurrent reactions leading to the desired
target compound.

2. Using an appropriate volume of reaction
medium, add a very large number of novel
stochastic proteins isolated from transformed or
transfected cells synthesizing these proteins. Carry
out an assay to determine if the target compound is
formed. If it is, confirm that this formation requires
the presence of the mixture of novel proteins. If so,
then the mixture should contain a subset of proteins
catalysing one or several reaction pathways leading
from the building block set to the target compound.
Purify, by dividing the initial ensemble of clones
which synthesize the set of novel stochastic
proteins, the subset which is required to catalyse the
sequence of reactions leading to the target
compound.

More precisely, as a non-limiting example, below
we describe selection of novel proteins capable of
catalysing the synthesis of a specific small peptide,
in particular, a pentapeptide, starting from a
building block set constituted of smaller peptides
and amino acids. All peptides are constituted by a
linear sequence of 20 different types of amino acids,
oriented from the amino to the carboxy terminus.
Any peptide can be formed in a single step by the
terminal condensation of two smaller peptides (or of
two amino acids), or by hydrolysis of a larger
peptide. A peptide with M residues can thus be
formed by M-1 condensation reactions. The number
of reactions, R, by which a set of peptides having
length 1, 2, 3, ... M residues can be interconverted
is larger than the number of possible molecular
species, T. This can be expressed as R/T ≈ M-2. Thus,
a large number of independent or partially
independent reaction pathways lead to the
synthesis of a specific target peptide. Choose a
pentapeptide whose presence can be determined
conveniently by some common assay technique, for
example HPLC (high pressure liquid
chromatography), paper chromatography, etc.
Formation of a peptide bond requires energy in a
dilute aqueous medium, but if the peptides
participating in the condensation reactions are
adequately concentrated, formation of peptide
bonds is thermodynamically favored over
hydrolysis and occurs efficiently in the presence of
an appropriate enzymatic catalyst, for example
pepsin or trypsin, without requiring the presence of
ATP or other high energy compounds. Such a
reaction mixture of small peptides whose amino
acids are marked radioactively to act as tracers with
³H, ¹⁴C, ³⁵S, constituting the building block set, can
be used at sufficiently high concentrations to lead to
condensation reactions.
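
The ratio of reactions to molecular species stated above can be checked with a simple counting model. The sketch below is an illustration under stated assumptions, not a derivation taken from this description: it counts every linear sequence up to length M as one species, and counts one condensation reaction (with its reverse hydrolysis) for each internal bond of each sequence. The function names are illustrative.

```python
def species_count(alphabet_size, max_length):
    """Number T of distinct linear sequences of length 1..max_length."""
    return sum(alphabet_size ** length for length in range(1, max_length + 1))

def condensation_count(alphabet_size, max_length):
    """Number R of condensations: a sequence of L residues has L-1 internal
    bonds, and each bond is one way of forming it from two shorter pieces."""
    return sum((length - 1) * alphabet_size ** length
               for length in range(2, max_length + 1))

def reactions_per_species(alphabet_size, max_length):
    return (condensation_count(alphabet_size, max_length)
            / species_count(alphabet_size, max_length))

for alphabet_size in (2, 20):
    for max_length in (5, 10):
        ratio = reactions_per_species(alphabet_size, max_length)
        print(f"alphabet {alphabet_size:>2}, M = {max_length:>2}: R/T = {ratio:.2f}")
```

In this model the ratio approaches M-2 for a two-letter alphabet and lies slightly below M-1 for 20 amino acids. In both cases it grows linearly with M, which is the point of the argument: the longer the peptides allowed, the more alternative routes exist to any given target.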

For example, it is feasible to proceed as follows:
15 mg of each amino acid and small peptide having
2 to 4 amino acids, chosen to constitute the building
block set, are dissolved in a volume of 0.25 ml to 1.0
ml of a 0.1 M pH 7.6 phosphate buffer. A large
number of novel proteins, generated and isolated as
described above, are purified from their bacterial or
other host cells. The mixture of these novel proteins
is dissolved to a final concentration on the order of
0.8 to 1.0 mg/ml in the same buffer. 0.25 ml to 0.5 ml
of the protein mixture is added to the mixture of
building blocks. This is incubated at 25°C to 40°C for
1 to 40 hours. Aliquots of 6 µl are removed at regular
intervals; the first is used as a "blank" and taken
before addition of the mixture of novel proteins.
These aliquots are analysed by chromatography
using n-butanol-acetic acid-pyridine-water
(30:6:20:24 by volume) as the solvent. The
chromatogram is dried and analysed by ninhydrin
or autoradiography (with or without intensifying
screens). Because the compounds constituting the
building block set are radioactively marked, the
target compound will be radioactive and it will have
specific activity high enough to permit detection at
the level of 1-10 ng. In place of standard
chromatographic analysis, it is possible to use HPLC
(high pressure liquid chromatography) which is
faster and simpler to carry out. More generally, all
the usual analytic procedures can be employed.
Consequently it is possible to detect a yield of the
target compound of less than one part per million by
weight compared to the compounds used as initial
building blocks.

If the pentapeptide is formed in the conditions
described above, but not when an extract is utilized
which is derived from host cells transformed by an
expression vector containing no stochastic genes,
the formation of the pentapeptide is not the result of
bacterial contaminants and thus requires the
presence of a subset of the novel proteins in the
reaction mixture.

The following step consists in the separation of
the particular subset of cells which contain
the genes coding for the novel proteins
catalysing the sequence of reactions leading to the
target pentapeptide. As an example, if the number
of reactions forming this sequence is 5, there are
about 5 novel proteins which catalyse the necessary
reactions. If the clone bank of bacteria containing
the expression vectors which code for the novel
genes has a number of distinct novel genes which is
on the order of 1,000,000, all these expression
vectors are isolated en masse and retransformed
into 100 distinct sets of 10⁸ bacteria at a ratio of
vectors to bacteria which is sufficiently low that on
average, the number of bacteria in each set which
are transformed is about half the number of initial
genes, i.e. about 500,000. Thus, the probability that
any given one of the 100 sets of bacteria contains
the entire set of 5 critical novel proteins is (1/2)⁵ =
1/32. Among the 100 initial sets of bacteria, about 3
will contain the 5 critical transformants. In each of
these sets, the total number of new genes is only
500,000 rather than 1,000,000. By successive
repetitions, the total number of which is about 20 in
the present case, this procedure isolates the 5
critical novel genes. Following this, mutagenesis
and selection on this set of 5 stochastic genes allows
improvement of the necessary catalytic functions. In
a case where it is necessary to catalyse a sequence
of 20 reactions and 20 genes coding novel proteins
need to be isolated in parallel, it suffices to adjust
the multiplicity of transformation such that each set
of 10⁸ bacteria receives 80% of the 10⁶ stochastic
genes, and to use 200 such sets of bacteria. The
probability that all 20 novel proteins are found in
one set is 0.8²⁰ ≈ 0.012. Thus, about 2 among the 200
sets will have the 20 novel genes which are needed
to catalyse the formation of the target compound.
The number of cycles required for isolation of the 20
novel genes is on the order of 30.
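
The probabilities used in this sequential diminution of the library can be reproduced with a few lines of arithmetic. The sketch assumes, as the text implicitly does, that each critical gene lands in a given set independently with probability equal to the fraction of the library that set receives. The function names are illustrative.

```python
import math

def prob_set_has_all(fraction_received, n_critical):
    """Probability that one set contains every one of the critical genes."""
    return fraction_received ** n_critical

def expected_complete_sets(n_sets, fraction_received, n_critical):
    """Expected number of sets that still contain all the critical genes."""
    return n_sets * prob_set_has_all(fraction_received, n_critical)

# 5 critical genes, each set receiving half of the library, 100 sets.
p5 = prob_set_has_all(0.5, 5)
print(f"5 genes, half library: p = {p5:.4f}, "
      f"expected complete sets = {expected_complete_sets(100, 0.5, 5):.1f}")

# 20 critical genes, each set receiving 80% of the library, 200 sets.
p20 = prob_set_has_all(0.8, 20)
print(f"20 genes, 80% of library: p = {p20:.4f}, "
      f"expected complete sets = {expected_complete_sets(200, 0.8, 20):.1f}")

# Halving a library of 10^6 genes down to order one takes about log2(10^6) rounds.
print(f"halvings needed for 10^6 genes: {math.log2(1e6):.1f}")
```

The first case gives a probability of 1/32 and about 3 complete sets out of 100, and roughly 20 halvings, matching the figures in the text. The second gives a probability of about 0.012 and about 2 complete sets out of 200.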

The principles and procedures described above
generalize from the case of peptides to numerous
areas of chemistry in which chemical reactions take
place in aqueous medium, in temperature, pH and
concentration conditions which permit general
enzymatic function. In each case it is necessary to
make use of an assay method to detect the
formation of the desired target compound(s). It is
also necessary to choose a sufficiently large number
of building block compounds to augment the
number of reaction sequences which lead to the
target compound.

The concrete example which was given for the
synthesis of a target pentapeptide can also be
generalized as follows:

The procedure as described generates, among
other products, stochastic peptides and proteins.
These peptides or proteins can act, catalytically or in
other ways, on other compounds. They can equally
constitute the substrates on which they act. Thus, it
is possible to select (or screen) for the capacity of
such stochastic peptides or proteins to interact
among themselves and thereby modify the
conformation, the structure or the function of some
among them. Similarly, it is possible to select (or
screen) for the capacity of these peptides and
proteins to catalyse among themselves hydrolysis
or other reactions
modifying the peptides. For example, the hydrolysis
of a given stochastic peptide by at least one member
of the set of stochastic peptides and proteins can be
followed and measured by radioactive marking of
the given protein followed by an incubation with a
mixture of the stochastic proteins in the presence of
ions such as Mg, Ca, Zn, Fe and ATP or GTP. The
appearance of radioactive fragments of the marked
protein is then measured as described. The
stochastic protein(s) which catalyse this reaction
can again be isolated, along with the gene(s)
producing them, by sequential diminution of the
library of transformed clones, as described above.

An extension of the procedure consists in the
selection of an ensemble of stochastic peptides and
polypeptides capable of catalysing a set of reactions
leading from the initial building blocks (amino acids
and small peptides) to some of the peptides or
polypeptides of the set. It is therefore also possible
to select an ensemble capable of catalysing its own
synthesis; such a reflexively autocatalytic set can be
established in a chemostat where the products of
the reactions are constantly diluted, but where the
concentration of the building blocks is maintained
constant. Alternatively, synthesis of such a set is
aided by enclosing the complex set of peptides in
liposomes by standard techniques. In a hypertonic
aqueous environment surrounding such liposomes,
condensation reactions forming larger peptides
lower the osmotic pressure inside the liposomes and
drive water molecules produced by the
condensation reactions out of the liposomes, hence
favoring synthesis of larger polymers. Existence of
such an autocatalytic ensemble can be verified by
two dimensional gel electrophoresis and by HPLC,
showing the synthesis of a stable distribution of
peptides and polypeptides. The appropriate reaction
volume depends on the number of molecular
species used, and the concentrations necessary to
favor the formation of peptide bonds over their
hydrolysis. The distribution of molecular species of
an autocatalytic ensemble is free to vary or change
due to the emergence of variant autocatalytic
ensembles. The peptides and polypeptides which
constitute an autocatalytic set may have certain
elements in common with the large initial ensemble
(constituted of coded peptides and polypeptides as
given by our procedure) but can also contain
peptides and polypeptides which are not coded by
the ensemble of stochastic genes coding for the
initial ensemble.

The set of stochastic genes whose products are
necessary to establish such an autocatalytic set can
be isolated as has been described, by sequential
diminution of the library of transformed clones. In
addition, an autocatalytic set can contain coded
peptides initially coded by the stochastic genes and
synthesized continuously in the autocatalytic set. To
isolate this coded subset of peptides and proteins,
the autocatalytic set can be used to obtain, through
immunization in an animal, polyclonal sera
recognizing a very large number of constituents of
the autocatalytic set.
These sera can be utilized to screen the library of
stochastic genes to find those genes expressing
proteins able to combine with the antibodies
present in the sera.

This set of stochastic genes expresses a large
number of coded stochastic proteins which persist
in the autocatalytic set. The remainder of the coded
constituents of such an autocatalytic set can be
isolated by serial diminution of the library of
stochastic genes, from which the subset detected by
immunological methods has first been subtracted.
Such autocatalytic sets of peptides and proteins,
obtained as noted, may find a number of practical
applications.

CLAIMS

1. Process for the production of peptides or
polypeptides by microbiological means,
characterized in that genes which are at least
partially composed of stochastic synthetic
polynucleotides are produced simultaneously in a
common milieu, that the genes thus obtained are
introduced into host cells, that the independent
clones of the transformed host cells containing
these genes are simultaneously cultivated so as to
clone the stochastic genes and lead to the
production of proteins expressed by each of these
stochastic genes, that screening and/or selection is
carried out on such clones of transformed host cells
to identify those clones producing peptides or
polypeptides having at least one specified property,
that the clones so identified are isolated, then grown
in a manner so as to produce at least one peptide or
polypeptide having the said property.

2. Process according to claim 1, characterized by
the fact that the genes are produced by stochastic
copolymerization of the four types of
deoxyphosphonucleotides A, C, G and T, starting
from the two extremities of an expression vector
which was previously linearized, then by formation
of cohesive extremities to create a first strand of
stochastic DNA constituted of a molecule of
expression vector possessing two stochastic
sequences whose 3' extremities are
complementary, followed by synthesis of the
second strand of the stochastic DNA.

3. Process according to claim 1, characterized by
the fact that the genes are produced by stochastic
copolymerization of double stranded
oligonucleotides which do not have cohesive ends,
in a manner so as to form fragments of stochastic
DNA, followed by ligation of these fragments in an
expression vector which was previously linearized.

4. Process according to claim 2 or 3, characterized
by the fact that the expression vector is a plasmid.

5. Process according to claim 4, characterized by
the fact that the expression vector is pUC8.

6. Process according to claim 2 or claim 3,
characterized by the fact that the expression vector
is a fragment of viral DNA.

7. Process according to claim 2 or claim 3,
characterized by the fact that the expression vector
is a hybrid of plasmid and viral DNA.

8. Process according to claims 1 to 6, characterized
by the fact that the host cells are prokaryotic cells.

9. Process according to claims 1 to 7 characterized 



10. Process according to claim 8, characterized by
the fact that the cells are chosen among HB101 and
C600.

11. Process according to claim 3, characterized by
the fact that the oligonucleotides form a group of
palindromic octamers.

12. Process according to claim 11, characterized by
the fact that the group of palindromic octamers is
the following group:

5' GGAATTCC 3'
5' GGTCGACC 3'
5' CAAGCTTG 3'
5' CCATATGG 3'
5' CATCGATG 3'
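
The palindromic property of the five octamers in claim 12 can be checked directly: each sequence should equal its own reverse complement. The Python sketch below is an illustration only and forms no part of the claimed process. The function names and the table of enzyme recognition sites (EcoRI, SalI, HindIII, NdeI, ClaI) are assumptions added here; the claims do not name any enzyme.

```python
# Illustrative check only; all names below are assumptions, not from the claims.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# The five octamers listed in claim 12, written 5' to 3'.
OCTAMERS = ["GGAATTCC", "GGTCGACC", "CAAGCTTG", "CCATATGG", "CATCGATG"]

# Commonly cited 6-base recognition sites (assumed here, not named in the claims).
KNOWN_SITES = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "HindIII": "AAGCTT",
    "NdeI": "CATATG",
    "ClaI": "ATCGAT",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(seq):
    """A sequence is palindromic if it equals its reverse complement."""
    return seq == reverse_complement(seq)


def sites_in(seq):
    """List the names of the known recognition sites found inside seq."""
    return sorted(name for name, site in KNOWN_SITES.items() if site in seq)


for octamer in OCTAMERS:
    print(octamer, is_palindromic(octamer), sites_in(octamer))
```

Under these assumptions, each octamer is palindromic and carries exactly one of the listed recognition sites, which is the kind of site that claim 15 relies on for cutting.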

13. Process according to claim 3, characterized by 
the fact that the oligonucleotides form a group of 
palindromic heptamers. 

14. Process according to claim 13, characterized 
by the fact that the group of palindromic heptamers 
is the following group: 

5' XTCGCGA 3'
5' XCTGCAG 3'
5' RGGTACC 3'

where X = A, G, C, or T, and R = A or T.
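
The degenerate notation of claim 14 can be expanded into the concrete heptamers it covers. The Python sketch below is illustrative only: the function names are assumptions, and the symbols X and R are given the meanings stated in the claim (which differ from the usual IUPAC meaning of R). It also checks that the six bases following the degenerate position form a palindrome, which is one possible reading of "palindromic heptamers", since a sequence of odd length cannot equal its own reverse complement.

```python
from itertools import product

# Degenerate symbols exactly as defined in claim 14 (not the IUPAC meaning of R).
DEGENERATE = {"X": "AGCT", "R": "AT"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# The three heptamer patterns of claim 14, written 5' to 3'.
HEPTAMER_PATTERNS = ["XTCGCGA", "XCTGCAG", "RGGTACC"]


def expand(pattern):
    """Return every concrete sequence matched by a degenerate pattern."""
    choices = [DEGENERATE.get(symbol, symbol) for symbol in pattern]
    return ["".join(bases) for bases in product(*choices)]


def core_is_palindromic(heptamer):
    """Check that the last six bases equal their own reverse complement."""
    core = heptamer[1:]
    return core == "".join(COMPLEMENT[base] for base in reversed(core))


for pattern in HEPTAMER_PATTERNS:
    for heptamer in expand(pattern):
        print(pattern, heptamer, core_is_palindromic(heptamer))
```

With the definitions given in the claim, the three patterns expand to ten distinct heptamers in total (four, four and two).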

15. Process according to claim 4 and one of
claims 12 or 14, characterized by the fact that one first
isolates and purifies the transforming DNA derived
from a culture of independent clones of transformed
host cells obtained by proceeding in the manner
specified in claim 11 or in claim 13, then that one
cuts the purified DNA by means of at least one
restriction enzyme which corresponds to a specific
restriction site present in these palindromic
octamers or heptamers, but absent from the
expression vector being utilized, that one thereafter
simultaneously treats the ensemble of linearized
stochastic DNA fragments so obtained by T4 DNA
ligase in such a manner as to create a new ensemble
of DNA containing new stochastic sequences, and
that one uses this new ensemble of transforming
DNA to transform host cells and clone such genes,
and finally that one carries out screening and/or
selection and isolates the new clones of
transformed cells and that one grows these so as to
produce at least one peptide or polypeptide having
a desired property.
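
The cut-and-religate step of claim 15 can be pictured with a deliberately simplified string model. The Python sketch below is an assumption-laden illustration, not the claimed procedure: it treats DNA as a single strand, ignores fragment orientation, the vector backbone and ligation efficiency, and uses a GAATTC site cut after its first base purely as an example. The function names, the example sequence and the fixed random seed are all hypothetical.

```python
import random


def cut_at_site(seq, site, offset):
    """Split seq at every occurrence of site, cutting `offset` bases into the site."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    # Drop empty pieces that arise when a site sits at the very start or end.
    return [fragment for fragment in fragments if fragment]


def religate(fragments, rng):
    """Join the fragments in a random order (a crude stand-in for ligation)."""
    pool = list(fragments)
    rng.shuffle(pool)
    return "".join(pool)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
insert = "AAGAATTCTTGAATTCCC"  # hypothetical stochastic insert with two sites
fragments = cut_at_site(insert, "GAATTC", 1)
reassembled = religate(fragments, rng)
print(fragments)
print(reassembled)
```

In this toy model the reassembled sequence always has the same length and base composition as the original insert; only the order of the fragments changes, which is the sense in which the ligation creates new stochastic sequences.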

16. Process according to claim 1, characterized by
the fact that the said property is the capacity to
catalyse a given chemical reaction.

17. Process according to claim 1 for the
production of several peptides and/or polypeptides,
characterized by the fact that the said property is the
capacity to catalyse a sequence of reactions leading
from a given group of initial chemical compounds to
at least one target compound.

18. Process according to claim 16 for the
production of an ensemble consisting of more than
one peptide and/or polypeptide which is reflexively
autocatalytic, characterized by the fact that the said
property is the capacity to catalyse the synthesis of
the ensemble itself starting from amino acids and/or

19. Process according to claim 1, characterized by
the fact that the said property is the capacity to
modify selectively the chemical and/or biological
properties of a given compound.

20. Process according to claim 19, characterized
by the fact that the said property is the capacity to
modify selectively the catalytic activity of a
polypeptide.

21. Process according to claim 19, characterized
by the fact that the said property is the capacity to
simulate or modify at least one biological function
of at least one biologically active compound.

22. Process according to claim 21, characterized
by the fact that the said biologically active
compound is chosen among the hormones,
neurotransmitters, adhesion or growth factors, and
specific regulators of replication and/or
transcription of DNA and/or translation of RNA.

23. Process according to claim 1, characterized by
the fact that the said property is the capacity to bind
to a given ligand.

24. Utilization of a peptide or polypeptide
obtained by the process according to claim 23 for
the detection and/or titration of the ligand.

25. Process according to claim 1, characterized by
the fact that the property is to have at least one
epitope similar to one of the epitopes of a given
antigen.

26. Process according to claims 19 and 25,
characterized by the fact that the said property is the
capacity to simulate or modify the effects of a
biologically active molecule, and that screening and/
or selection of the clones of transformed host cells
producing at least one peptide or polypeptide
having this property is carried out by preparing
antibodies against that molecule, and utilizing these
antibodies so obtained to identify those clones
containing those peptides or polypeptides, then by
growing the clones thus identified and separating
and purifying the peptide or polypeptide produced
by these clones, and finally by submitting these
peptide(s) or polypeptide(s) to an assay in vitro to
verify that it has in fact the capacity to simulate or
modify the effects of the said molecule.

27. Peptides or polypeptides obtained by the
process according to claim 1 or claim 26, utilizable
as active substances having a pharmacologic and/or
chemotherapeutic action.

28. Peptides or polypeptides, obtained by the
process according to claim 25, utilizable to diminish,
in vitro or in vivo, the concentration of free
antibodies specific for the said antigen, by
formation of bonds between these peptides or
polypeptides and these antibodies.

29. Peptides or polypeptides according to claims
27 or 28, utilizable as agents to suppress
immunological hypersensitivity.

30. Peptides or polypeptides, obtained by the
process according to claim 25, utilizable as agents to
create tolerance with respect to the said antigen.

31. Process according to claim 25, characterized
by the fact that the antigen is EGF.

32. Peptides or polypeptides obtained by the
process according to claim 31, utilizable for the



33. Process according to claim 1, characterized by
the fact that clones of transformed host cells
producing peptides or polypeptides having the
desired property are identified and isolated by
affinity chromatography on antibodies
corresponding to a protein expressed by the natural
fragment of the DNA hybrid.

34. Process according to claim 33, characterized
by the fact that the natural fragment of the DNA
hybrid contains a gene expressing β-galactosidase,
and that one identifies and isolates the said clones
of transformed cell hosts by affinity
chromatography with anti-β-galactosidase
antibodies.

35. Process according to claim 1 or claim 34,
characterized by the fact that after expression and
purification of the hybrid peptides or polypeptides,
the novel fragments are separated and isolated.

36. Application of the process according to claim
25 or claim 26, for the preparation of a vaccine,
characterized by the fact that antibodies against a
pathogenic agent are obtained and used to identify
those clones producing at least one protein having
at least one epitope similar to one of the epitopes of
the pathogenic agent, that the corresponding clones
of transformed host cells are grown in such a
manner as to produce this protein, that the protein is
isolated and purified from the cultures of clones of
cells and that this protein is used for the production
of a vaccine against the pathogenic agent.

37. Application according to claim 36, for the
preparation of an anti-HVB vaccine, characterized by
the fact that at least one capsid protein of the HVB
virus is extracted and purified, that this protein is
injected into the body of an animal capable of
forming antibodies against this protein, that these
antibodies are recovered and purified, that these
antibodies are used to identify those clones
producing at least one protein having at least one
epitope similar to one of the epitopes of the HVB
virus, that the clones of transformed host cells
corresponding to these clones are grown in a
manner to produce this protein, that this protein is
isolated and purified from these cultures of host
cells, and that this protein is used for the production
of an anti-HVB vaccine.

38. Process according to claim 1, characterized by
the fact that the host cells consist of bacteria of
Escherichia coli type, whose genome contains
neither the natural β-galactosidase gene, nor the
EBG gene, that is, Z-, EBG- E. coli, and that these
transformed host cells are cultured in an X-gal
medium also containing the inducer IPTG, that
clones which are positive for β-galactosidase
function are detected in the culture milieu, that
thereafter this DNA is transplanted to a clone of host
cells appropriate for industrial production of at least
one peptide, polypeptide or protein with
β-galactosidase function.

39. Process according to claim 1, characterized by
the fact that the said property is the capacity to bind
to a given compound.

40. Process according to claim 39, characterized
by the fact that the said compound is chosen among

41. Process according to claim 40, characterized
by the fact that the said proteins are proteins
regulating the transcription activity or replication of
DNA.

42. Process according to claim 39, characterized
by the fact that the said compound is chosen among
the sequences of DNA and RNA.

43. Proteins obtained by the process according to 
claim 40 or claim 42. 

44. Process for the production of DNA,
characterized by the fact that, in the same milieu,
genes which are at least partially composed of
stochastic synthetic polynucleotides are produced,
that the genes so produced are introduced into host
cells in a manner to produce an ensemble of
[...] produce independent clones of the host cells so
produced, that screening and/or selection is carried
out on this ensemble to identify those host cells
which contain those stochastic sequences of DNA
having at least one desired property, and that such
DNA is isolated from the identified cultures of the
host cells.

45. Process according to claim 44, characterized
by the fact that the said property is the capacity to
bind to a given compound.

46. Process according to claim 45, characterized
by the fact that the said compound is chosen among
the peptides, polypeptides and proteins.

47. Process according to claim 45, characterized
by the fact that the said compound is a compound
regulating the transcription activity or the
replication of DNA.

48. Process according to claim 47, characterized
by the fact that the said compound is a regulatory
protein controlling the transcription or replication of
DNA.



49. Utilization of a sequence of DNA obtained by
the process according to claim 46, or claim 47, as a
cis-regulatory sequence of replication or
transcription of a neighboring sequence of DNA.

50. Process according to claim 42, characterized
by the fact that the proteins obtained have the
capacity to modify the transcription activity, the
replication, or the stability of DNA.

51. Utilization of a protein obtained by the process
according to claim 48, to modify the transcription,
replication or stability of a sequence of DNA in a cell
containing this sequence of DNA and expressing
this protein.

52. Process for the production of RNA,
characterized by the fact that, in the same milieu,
genes which are at least partially composed of
synthetic stochastic polynucleotides are produced
simultaneously, that the genes thus obtained are
introduced in host cells in a manner to produce an
ensemble of transformed host cells, that the
independent clones of transformed host cells so
produced are grown simultaneously, that the
screening and/or selection is carried out on this
ensemble in a manner to identify those host cells
containing stochastic sequences of RNA having at
least one desired property, and that the RNA is
isolated from the cultures of host cells so identified.

53. Process according to claim 52, characterized
by the fact that the said property is the capacity to
bind to a given compound.

54. Process according to claim 52, characterized
by the fact that the said property is the capacity to
catalyse a given chemical reaction.

55. Process according to claim 52, characterized
by the fact that the said property is to be a transfer
RNA.